Skip to content

PYTHON-5253 Automated Spec Test Sync #2409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 50 commits into
base: master
Choose a base branch
from

Conversation

sleepyStick
Copy link
Contributor

@sleepyStick sleepyStick commented Jun 25, 2025

Generates at most one weekly PR on Monday's at 9am pst (assuming the evergreen hosts are in utc time...).

At a high level, I've written a script that simply loops through pymongo's existing spec tests and calls the existing bash resync-spec.sh to re-sync all the specs. Then if changes were made, a PR would be created. If all specs are up to date, then no PR will be made.

This PR will be created by mongodb-drivers-pr-bot with the changes. If an error occurred during the syncing of a spec, the PR description will display the name of the spec along with stderr from the bash resync-spec.sh <spec> command.

An example of a generated PR #2407

Copy link
Contributor Author

@sleepyStick sleepyStick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments that I think might help the reviewer :)

cd $SPECS/source
make
cd -
if ! [ -n "${CI:-}" ]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is always defined in an EVG run and we only want to run these things in a local checkout.

#!/usr/bin/env bash

tools="../drivers-evergreen-tools"
git clone https://github.com/mongodb-labs/drivers-evergreen-tools.git $tools
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cloning drivers evergreen tools because there are some scripts (mainly ./secrets-export.sh and get-access-token.sh that are there and i'd rather not copy it over)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we already have a script to clone this repo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do? I didn't know that. Do you know what it's called?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.evergreen/scripts/configure-env.sh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing it out! I just removed the clone here and found the existing path for it!

and len(list(os.scandir(pathlib.Path(entry.path) / "tests"))) > 1
}
test_set = {entry.name.replace("-", "_") for entry in os.scandir(directory) if entry.is_dir()}
known_mappings = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a slight difference in how we name the directories vs how the spec names them sometimes? (Not 100% sure if these are correct though, so please let me know if these need to be changed!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we often have different names than the spec repo, that's expected.

if spec.name in ["asynchronous"]:
continue
process = subprocess.run(
["bash", "./.evergreen/resync-specs.sh", spec.name], # noqa: S603, S607
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note:

S603 `subprocess` call: check for execution of untrusted input
S607 Starting a process with a partial executable path

S603: I think spec.name is safe because they're only acquired from our directories within test
S607: I think bash is sufficient for executable (they want the whole path)

errored[spec.name] = process.stderr

process = subprocess.run(
["git diff --name-only | awk -F'/' '{print $2}' | sort | uniq"], # noqa: S607
Copy link
Contributor Author

@sleepyStick sleepyStick Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note:

S607 Starting a process with a partial executable path

I tried to separate the string into ["git", "diff --name-only | awk -F'/' '{print $2}' | sort | uniq"] but that caused the command to fail? So I kept it as a whole string..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each space-separated word has to be a separate arg, keeping it as one here is probably more readable.

@@ -0,0 +1,1327 @@
diff --git a/test/load_balancer/cursors.json b/test/load_balancer/cursors.json
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a file of known spec test differences that we don't want to be committed. If any one of these is incorrect, or I'm missing some, lmk!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a section to CONTRIBUTING.md detailing the why and how of modifying this file. It should also contain the policy we follow summarized by Shane in the Slack thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Just added a blurb, but lmk if the language is unclear or anything like that. (I feel like my technical writing skills are lacking sometimes)

@sleepyStick sleepyStick marked this pull request as ready for review June 26, 2025 17:04
Copy link
Contributor

@NoahStapp NoahStapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Some minor comments.

["bash", "./.evergreen/resync-specs.sh", spec.name], # noqa: S603, S607
capture_output=True,
text=True,
check=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation behind check=False here and then checking the returncode explicitly instead of using check=True?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err no motivation (I learned about check when ruff yelled at me for it so) Changed it to check=True inside of a try catch!

errored[spec.name] = process.stderr

process = subprocess.run(
["git diff --name-only | awk -F'/' '{print $2}' | sort | uniq"], # noqa: S607
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each space-separated word has to be a separate arg, keeping it as one here is probably more readable.

and len(list(os.scandir(pathlib.Path(entry.path) / "tests"))) > 1
}
test_set = {entry.name.replace("-", "_") for entry in os.scandir(directory) if entry.is_dir()}
known_mappings = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we often have different names than the spec repo, that's expected.

@sleepyStick sleepyStick requested a review from NoahStapp June 27, 2025 16:09
@@ -0,0 +1,1327 @@
diff --git a/test/load_balancer/cursors.json b/test/load_balancer/cursors.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a section to CONTRIBUTING.md detailing the why and how of modifying this file. It should also contain the policy we follow summarized by Shane in the Slack thread.

@sleepyStick sleepyStick requested a review from a team as a code owner June 27, 2025 19:00
@sleepyStick sleepyStick requested a review from NoahStapp June 27, 2025 19:27
@sleepyStick sleepyStick requested a review from blink1073 July 11, 2025 17:01
@sleepyStick
Copy link
Contributor Author

It looks like some tests got unskipped by accident, for example:

spruce.mongodb.com/task/mongo_python_driver_mongodb_v4.4_test_server_version_pypy3.10_async_auth_ssl_sharded_cluster_patch_244f17d57b1a28b06756b026372e27bc07b61e67_68713fbc31f3100007df7822_25_07_11_16_45_51/tests?execution=0&sorts=STATUS%3AASC

image

Oh I think I just need to merge master into this branch? We'll see if that fixes it

@blink1073
Copy link
Member

I opened https://jira.mongodb.org/browse/PYTHON-5439 for the link failure

blink1073
blink1073 previously approved these changes Jul 11, 2025
Copy link
Member

@blink1073 blink1073 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@sleepyStick sleepyStick requested a review from ShaneHarvey July 11, 2025 20:23
Copy link
Contributor

@NoahStapp NoahStapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Added several suggestions for the CONTRIBUTING.md documentation section to ensure active voice and overall consistency.

#!/bin/bash
PYMONGO=$(dirname "$(cd "$(dirname "$0")" || exit; pwd)")

rm -rf $PYMONGO/test/server_selection/logging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a comment explaining why we remove this directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took this line from resync-spec.sh (moved it here for consistency) but there was no comment in resync-spec.sh for why its removed..do you happen to know why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We put those tests in test/server_selection_logging instead.

token=$(bash ./get-access-token.sh $repo $owner)
if [ -z "${token}" ]; then
echo "Failed to get github access token!"
popd || exit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to exit anyway, why popd first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point....turns out i had originally just done a popd and then exit 1, but then my pycharm linter was like "yo, what if popd fails? you should add an || exit to be safe" so i did LOL but its kinda unnecessary cuz I know I did a pushd at the beginning of the script.
anyways, i removed the || exit

Copy link
Contributor Author

@sleepyStick sleepyStick Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait jk, its not just pycharm linter, our liters say that too...(but it was fine when i ran it locally? 😕)
edit: i realized i didn't actually answer your question...i wanna popd before i exit so that the resync-all-specs.sh is at the path it expects to be on (i delete a file that the script generated to "cleanup" after calling this create-pr script)

Co-authored-by: Noah Stapp <[email protected]>
@sleepyStick sleepyStick requested a review from NoahStapp July 14, 2025 16:31
Copy link
Contributor

@NoahStapp NoahStapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment needed for test/server_selection_logging.

@sleepyStick
Copy link
Contributor Author

Comment needed for test/server_selection_logging.

Moved that line back to resync-spec.sh and added a comment!

@sleepyStick sleepyStick requested a review from NoahStapp July 14, 2025 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants